Distance Preserving Mapping from Categories to Numbers for Indexing
Abstract
Memory-Based Reasoning and K-Nearest Neighbor Searching are frequently adopted data mining techniques, but they scale poorly. Indexing is a promising solution. However, categorical attributes are difficult to index, because the categories of a nominal attribute have no linear ordering. In this paper, we propose heuristic algorithms that map categories to numbers so that as many of the distance relationships among categories as possible are preserved. We empirically study the performance of the algorithms under different distance situations.
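The abstract does not describe the heuristics themselves, so the following is only a minimal sketch of the underlying problem, not the authors' algorithm. It assumes a symmetric pairwise distance table among categories and scores a mapping by the number of triples (a, b, c) for which "a is closer to b than to c" still holds after mapping. The function names (`preserved_relations`, `best_mapping`), the scoring rule, and the toy colour distances are all hypothetical. The search is exhaustive, so it is only usable for a handful of categories; a heuristic would replace it in practice.

```python
from itertools import permutations


def preserved_relations(dist, mapping):
    """Count triples (a, b, c) with dist[a][b] < dist[a][c] whose
    ordering also holds between the mapped numbers (assumed scoring rule)."""
    cats = list(mapping)
    count = 0
    for a in cats:
        for b in cats:
            for c in cats:
                if len({a, b, c}) < 3:
                    continue  # need three distinct categories
                closer_before = dist[a][b] < dist[a][c]
                closer_after = abs(mapping[a] - mapping[b]) < abs(mapping[a] - mapping[c])
                if closer_before and closer_after:
                    count += 1
    return count


def best_mapping(dist):
    """Try every assignment of positions 0..n-1 to the categories and keep
    the first one with the highest score (brute force, not a heuristic)."""
    cats = sorted(dist)
    best, best_score = None, -1
    for perm in permutations(range(len(cats))):
        mapping = dict(zip(cats, perm))
        score = preserved_relations(dist, mapping)
        if score > best_score:
            best, best_score = mapping, score
    return best, best_score


# Hypothetical distances among three colour categories.
dist = {
    "red":    {"red": 0, "orange": 1, "blue": 4},
    "orange": {"red": 1, "orange": 0, "blue": 3},
    "blue":   {"red": 4, "orange": 3, "blue": 0},
}

best, score = best_mapping(dist)
print(best, score)
```

In this toy example the categories carry three strict distance relationships, but no placement on consecutive integers keeps all of them, because the middle category ends up equidistant from its two neighbours. That loss is why the abstract speaks of preserving as many relationships as possible rather than all of them.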
Publication date: 2004